Graphics processing units (GPUs) are extensively used as accelerators across many application domains, ranging from general-purpose applications to neural networks and cryptocurrency mining. The initial utilization paradigm for GPUs was a single application accessing all of the GPU's resources. In recent years, time sharing among applications on a GPU has become common; spatial sharing, however, has not been fully explored. When concurrent applications share the computational resources of a GPU, performance can be improved by eliminating idle resources. Additionally, the incorporation of GPUs into embedded and mobile devices increases the demand for power-efficient computation due to battery limitations. In this article, we present an allocation methodology for streaming multiprocessors (SMs). The methodology targets two concurrent applications on a GPU and determines an allocation scheme that provides power-efficient execution combined with improved GPU performance. Experimental results show that the developed methodology yields higher throughput while achieving improved power efficiency compared to other power-aware and performance-aware SM policies. Adopting the presented methodology leads to higher performance for applications executing concurrently on a GPU, and thus to faster and more efficient acceleration, even on devices with constrained energy sources.
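To illustrate the kind of decision such a methodology makes, the following is a minimal sketch of searching for an SM allocation between two concurrent applications that maximizes aggregate throughput per watt. The throughput and power models here (a saturating linear throughput curve and a static-plus-per-SM power cost, with unallocated SMs assumed power-gated) are illustrative assumptions for the sketch, not the authors' actual models or methodology.

```python
TOTAL_SMS = 16  # assumed GPU size (illustrative)

def throughput(sms, scale, saturation):
    # Diminishing-returns model (assumption): throughput grows linearly
    # with SMs until the application saturates at `saturation` SMs.
    return scale * min(sms, saturation)

def power(active_sms, idle=30.0, per_sm=5.0):
    # Linear power model (assumption): static power plus a per-SM cost
    # for each active SM; unallocated SMs are assumed power-gated.
    return idle + per_sm * active_sms

def best_allocation(app_a, app_b, total=TOTAL_SMS):
    # Exhaustively score every feasible partition (a SMs for A, b for B,
    # a + b <= total) by throughput per watt and return the best one.
    best = None
    for a in range(1, total):
        for b in range(1, total - a + 1):
            tput = throughput(a, *app_a) + throughput(b, *app_b)
            eff = tput / power(a + b)
            if best is None or eff > best[2]:
                best = (a, b, eff)
    return best

# Hypothetical workloads: A scales well up to 8 SMs, B scales weakly.
a, b, eff = best_allocation(app_a=(2.0, 8), app_b=(1.0, 12))
print(a, b, round(eff, 3))
```

With these toy models, the search gives most SMs to the application with the higher marginal throughput per SM and leaves the rest power-gated, which is the trade-off the abstract describes between throughput and power efficiency.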